09/16/2004 THU 12:54 FAX 7145573347 



BSTZ CM 






Appl. No. 09/134,272

Reply to Office action of June 21, 2004

Amendments to the Claims: 

This listing of claims will replace all prior versions, and listings, of claims in the application: 

Listing of Claims: 

Claims 1-3. (Canceled.)

Claim 4. (Previously Presented) The method of claim 6, upon determining that the sum
is greater than the long-term averaged energy and before determining the peak-to-mean
likelihood ratio, the method further comprises:

determining whether a difference between the long-term averaged energy and the short-
term averaged energy is less than a predetermined threshold;

determining that the current audio frame represents voice if the difference is greater than 
the predetermined threshold; and 

continuing by determining the peak-to-mean likelihood ratio if the difference is less than 
the predetermined threshold. 

Claim 5. (Previously Presented) The method of claim 6, wherein the determining of the
short-term averaged energy comprises:

determining an energy, in decibels, of the current audio frame; 

determining a short-term averaged energy for a prior audio frame; and 

conducting a weighted average of the energy of the current audio frame and the short- 
term averaged energy for the prior audio frame. 
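
The weighted average recited in claim 5 can be illustrated with a minimal sketch. The claim does not state how the frame energy is measured or what weight is used, so the mean-square definition, the `floor` value, the `weight` parameter, and both function names below are assumptions made only for illustration.

```python
import math

def frame_energy_db(samples, floor=1e-12):
    """Energy of one audio frame in decibels.

    Mean-square power is an assumed definition; the claim only says
    the energy is expressed in decibels.
    """
    mean_square = sum(s * s for s in samples) / len(samples)
    # The floor avoids log10(0) on an all-zero frame (assumed value).
    return 10.0 * math.log10(max(mean_square, floor))

def update_short_term_energy(prior_avg_db, current_db, weight=0.25):
    """Weighted average of the current frame energy and the prior
    short-term averaged energy. The weight is a hypothetical value."""
    return weight * current_db + (1.0 - weight) * prior_avg_db

# Example: a full-scale frame (0 dB) following a quiet history (-40 dB).
current = frame_energy_db([1.0, -1.0, 1.0, -1.0])
updated = update_short_term_energy(-40.0, current)
```

A larger weight would make the short-term average follow the current frame more closely; the claim leaves that choice open.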

Claim 6. (Previously Presented) A method for enhancing voice activity detection 
comprising: 

determining a short-term averaged energy for a current audio frame; 
determining a long-term averaged energy for the current audio frame;
determining whether a sum of the short-term averaged energy and a factor is greater than
the long-term averaged energy;
determining that the current audio frame represents silence if the sum is less than the
long-term averaged energy, without necessitating a determination of the peak-to-mean likelihood
ratio;

determining a peak-to-mean likelihood ratio, the determining a peak-to-mean likelihood 
ratio comprises 

calculating an averaged peak-to-mean ratio for the current audio frame, 
determining a maximum averaged peak-to-mean ratio, 
determining a minimum averaged peak-to-mean ratio,

determining a difference between the maximum averaged peak-to-mean ratio and 
the averaged peak-to-mean ratio for the current audio frame, 

determining a difference between the maximum averaged peak-to-mean ratio and 
the minimum averaged peak-to-mean ratio, and 

conducting a ratio, a denominator of the ratio being the difference between the
maximum averaged peak-to-mean ratio and the minimum averaged peak-to-mean ratio,
the numerator being the difference between the maximum averaged peak-to-mean ratio 
and the averaged peak-to-mean ratio; and 

comparing the peak-to-mean likelihood ratio to a selected threshold to determine whether 
the current audio frame represents a voice signal. 
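
The peak-to-mean likelihood ratio recited in claim 6 can be written out as a short sketch. The numerator and denominator follow the claim wording; the function name, the argument names, and the handling of a zero-width range are assumptions, since the claim does not address that case. The direction of the final threshold comparison is not stated in the claim, so it is left out here.

```python
def peak_to_mean_likelihood_ratio(current_avg, max_avg, min_avg):
    """Normalized ratio built from averaged peak-to-mean ratios.

    numerator   = max_avg - current_avg
    denominator = max_avg - min_avg
    """
    span = max_avg - min_avg
    if span <= 0:
        # Assumption: with no spread between maximum and minimum the
        # ratio is undefined, so this sketch returns 0.0.
        return 0.0
    return (max_avg - current_avg) / span

# Example: the current frame sits halfway between the tracked extremes.
ratio = peak_to_mean_likelihood_ratio(current_avg=6.0, max_avg=10.0, min_avg=2.0)
```

Under this formula the ratio is 0 when the current frame equals the tracked maximum and 1 when it equals the tracked minimum.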

Claim 7. (Canceled.)

Claim 8. (Canceled.)

Claim 9. (Previously Presented) The communication module of claim 12, wherein the
voice activity detector, when executed, controls the processing unit to determine whether a
difference between the long-term averaged energy and the short-term averaged energy is less
than a predetermined threshold, and to signal that the current audio frame represents voice if the
difference is greater than the predetermined threshold.

Claim 10. (Canceled.)

Docket No: 003239.P010 Page 3 of 14 WWS/NDN/tn 

Appl.No. 09/134,272 

Amdt Dated September 16, 2004 

Reply to Office action of June 21, 2004 

Claim 11. (Previously Presented) The communication module of claim 9, wherein the
voice activity detector, when executed, controls the processing unit to determine a peak-to-mean
ratio by (i) sampling an analog signal a predetermined number of times to produce a plurality of
sampled signals each having a sampled value, (ii) determining a maximum value of the plurality 
of sampled signals, and (iii) conducting a ratio between an absolute value of the maximum value 
and a summation of the sampled values for the plurality of sampled signals. 
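
Steps (ii) and (iii) of claim 11 can be sketched as follows, reading the claim literally: the absolute value of the maximum sampled value divided by the summation of the sampled values. The function name and the error raised for a zero summation are assumptions; the claim does not say how that case is handled.

```python
def peak_to_mean_ratio(samples):
    """Ratio of |maximum sampled value| to the sum of the sampled values,
    following the literal wording of claim 11."""
    peak = abs(max(samples))
    total = sum(samples)
    if total == 0:
        # Assumption: the claim is silent on a zero summation, so this
        # sketch refuses to divide rather than guess a result.
        raise ValueError("summation of sampled values is zero")
    return peak / total

# Example with four sampled values: maximum is 3, summation is 2.
example = peak_to_mean_ratio([1, -4, 2, 3])
```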

Claim 12. (Previously Presented) A communication module comprising:
a substrate; 

a processing unit placed on the substrate; and 

a memory coupled to the processing unit, the memory to contain a voice activity detector 
which, when executed, controls the processing unit to 

determine whether a sum of a short-term averaged energy and a predetermined 
factor is greater than a long-term averaged energy, and to signal that a current audio
frame represents silence if the sum is less than the long-term averaged energy, and 

if the current audio frame is not determined to be silence using the short-term 
averaged energy and the long-term averaged energy, determine a peak-to-mean
likelihood ratio for the current audio frame by (i) monitoring a maximum averaged peak- 
to-mean ratio and a minimum averaged peak-to-mean ratio, (ii) determining a first result 
being a difference between the maximum averaged peak-to-mean ratio and the averaged 
peak-to-mean ratio for the current audio frame, (iii) determining a second result being a 
difference between the maximum averaged peak-to-mean ratio and the minimum 
averaged peak-to-mean ratio, and (iv) conducting a ratio between the first result as a 
numerator and the second result as a denominator, and comparing the peak-to-mean 
likelihood ratio to a selected threshold to determine whether the current audio frame
represents a voice signal. 

Claim 13. (Canceled.) 

Claim 14. (Canceled.)

Claim 15. (Previously Presented) A machine readable medium having embodied thereon
a computer program for processing by a machine, the computer program comprising: 

a first routine for determining a normalized peak-to-mean likelihood ratio including (i) a
denominator having a value substantially equal to a difference between a maximum averaged
peak-to-mean ratio and a minimum averaged peak-to-mean ratio and (ii) a numerator having a 
value substantially equal to a difference between the maximum averaged peak-to-mean ratio and 
the averaged peak-to-mean ratio; 

a second routine for comparing the peak-to-mean likelihood ratio to a selected threshold 
to determine whether a current audio frame being transmitted represents a voice signal;

a third routine for determining a short-term averaged energy for successive audio frames 
including the current audio frame, the third routine being executed before the first and second 
routines; 

a fourth routine for determining a long-term averaged energy for the current audio frame, 
the fourth routine being executed before the first and second routines; 

a fifth routine for determining whether a sum of the short-term averaged energy and a 
predetermined factor is greater than the long-term averaged energy, the fifth routine being 
executed before the first and second routines; and 

a sixth routine for determining whether a difference between the long-term averaged 
energy and the short-term averaged energy is less than a predetermined threshold, the sixth 
routine being executed after determining that the sum is greater than the long-term averaged 
energy and before execution of the first and second routines. 

Claim 16. (Original) The machine readable medium of claim 15, wherein the fifth
routine determining that the current audio frame represents silence if the sum is less than the
long-term averaged energy. 

Claim 17. (Original) The machine readable medium of claim 15, wherein the sixth 
routine determining that the current audio frame represents voice if the difference is greater than 
the predetermined threshold. 

Claims 18-21. (Canceled.)

Claim 22. (Previously Presented) A method for enhancing voice activity detection
comprising: 

determining a short-term averaged energy for a current audio frame; 

determining a long-term averaged energy for the current audio frame; 

determining whether a sum of the short-term averaged energy and a factor is greater than 
the long-term averaged energy; 

determining that the current audio frame represents silence if the sum is less than the 
long-term averaged energy, without necessitating a determination of the peak-to-mean likelihood 
ratio; 

determining a peak-to-mean likelihood ratio including (i) a denominator having a value 
substantially equal to a difference between a maximum averaged peak-to-mean ratio and a 
minimum averaged peak-to-mean ratio and (ii) a numerator having a value substantially equal to
a difference between the maximum averaged peak-to-mean ratio and the averaged peak-to-mean 
ratio; and 

comparing the peak-to-mean likelihood ratio to a selected threshold to determine whether
a current audio frame represents a voice signal. 

Claim 23. (Previously Presented) The method of claim 22, upon determining that the 
sum is greater than the long-term averaged energy and before determining the peak-to-mean 
likelihood ratio, the method further comprises: 

determining whether a difference between the long-term averaged energy and the short-
term averaged energy is less than a predetermined threshold;

determining that the current audio frame represents voice if the difference is greater than 
the predetermined threshold; and 

continuing by determining the peak-to-mean likelihood ratio if the difference is less than 
the predetermined threshold. 
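
The order of the energy tests in claims 22 and 23 can be sketched as a two-step gate that runs before any peak-to-mean likelihood ratio is computed. The names `Stage` and `energy_stage` are hypothetical, and the treatment of exact equality (which the claims do not address) is an assumption noted in the comments.

```python
from enum import Enum

class Stage(Enum):
    SILENCE = "silence"          # decided by the energy sum test
    VOICE = "voice"              # decided by the energy difference test
    CHECK_RATIO = "check_ratio"  # continue to the peak-to-mean likelihood ratio

def energy_stage(short_db, long_db, factor_db, diff_threshold_db):
    """Apply the two energy tests in the order the claims recite them."""
    # Claim 22: silence if (short-term + factor) is less than long-term.
    if short_db + factor_db < long_db:
        return Stage.SILENCE
    # Claim 23: voice if (long-term - short-term) exceeds the threshold.
    # Assumption: equality in either test falls through to the next step.
    if long_db - short_db > diff_threshold_db:
        return Stage.VOICE
    # Otherwise continue by determining the peak-to-mean likelihood ratio.
    return Stage.CHECK_RATIO

# Example values in decibels (illustrative only).
quiet = energy_stage(-30.0, -20.0, factor_db=2.0, diff_threshold_db=0.5)
```

Read literally, a frame that passes the first test has a long-term minus short-term difference of at most the factor, so the voice branch in this sketch is reachable only when the difference threshold is smaller than the factor.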

Claim 24. (Previously Presented) The method of claim 22, wherein the determining of 
the short-term averaged energy comprises: 

determining an energy, in decibels, of the current audio frame; 
determining a short-term averaged energy for a prior audio frame; and
conducting a weighted average of the energy of the current audio frame and the short-
term averaged energy for the prior audio frame.

Claim 25. (Previously Presented) The method of claim 6, wherein the short-term
averaged energy is an accumulation of signal energy associated with successive audio frames 
including the current audio frame. 

Claim 26. (Previously Presented) The method of claim 25, wherein the successive audio 
frames are pulse code modulation (PCM) audio frames. 

Claim 27. (Previously Presented) The method of claim 25, wherein the long-term
averaged energy is based on the accumulation of the signal energy and a background noise level. 

Claim 28. (Previously Presented) The method of claim 6, wherein the short-term 
averaged energy is based on a current frame entry and a prior short-term averaged energy value. 

Claim 29. (Previously Presented) The method of claim 6, wherein the factor is at least 
two decibels. 

Claim 30. (Previously Presented) The communication module of claim 12, wherein the 
short-term averaged energy determined by the voice activity detector is an accumulation of 
signal energy associated with the successive audio frames being pulse code modulation (PCM) 
audio frames. 

Claim 31. (Previously Presented) The communication module of claim 30, wherein the
long-term averaged energy determined by the voice activity detector is based on the 
accumulation of the signal energy and a background noise level. 

Claim 32. (Previously Presented) The communication module of claim 12, wherein the 
predetermined factor is at least two decibels. 

Claim 33. (Previously Presented) The software readable medium of claim 15, wherein
the short-term averaged energy determined by the third routine is an accumulation of signal
energy associated with the successive audio frames.

Claim 34. (Previously Presented) The software readable medium of claim 33, wherein
the long-term averaged energy determined by the fourth routine is based on the accumulation of 
the signal energy and a background noise level. 

Claim 35. (Previously Presented) The method of claim 22, wherein the short-term
averaged energy is an accumulation of signal energy associated with successive audio frames
including the current audio frame. 

Claim 36. (Previously Presented) The method of claim 22, wherein the short-term 
averaged energy is based on the current audio frame and a prior short-term averaged energy 
value. 

Claim 37. (Previously Presented) The method of claim 22, wherein the factor is at least
two decibels.